Statistical Workloads for Energy Efficient MapReduce
Abstract
Energy efficiency is a growing concern in modern datacenters. As Internet services increasingly rely on MapReduce workloads to fuel their flagship businesses, there is a growing need for better mechanisms to evaluate MapReduce energy efficiency. We present a statistics-driven workload generation framework that distills summary statistics from production MapReduce traces and realistically reproduces representative workloads. These workloads help us evaluate design decisions with regard to scale, configuration, scheduling, and other issues. We use this framework to identify specific suggestions to improve MapReduce energy efficiency. Our key finding is that evaluations using trace-driven workloads reverse current design priorities in optimizing for data-intensive synthetic jobs.
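The idea of distilling summary statistics from a trace and replaying a representative workload can be illustrated with a minimal sketch. It is not the framework described in the abstract: the `Job` record, the choice of empirical distributions as the "summary statistics", and the sample trace values are all assumptions made for illustration. Job sizes are resampled as whole (input, shuffle, output) tuples so that the ratios between stages seen in the trace are preserved.

```python
import random
from typing import List, NamedTuple, Tuple


class Job(NamedTuple):
    """Hypothetical trace record; field names are illustrative assumptions."""
    submit_time_s: float
    input_bytes: int
    shuffle_bytes: int
    output_bytes: int


def summarize(trace: List[Job]) -> Tuple[List[float], List[Tuple[int, int, int]]]:
    """Distill a trace into two empirical distributions:
    inter-arrival gaps and per-job (input, shuffle, output) sizes."""
    ordered = sorted(trace, key=lambda j: j.submit_time_s)
    gaps = [b.submit_time_s - a.submit_time_s for a, b in zip(ordered, ordered[1:])]
    sizes = [(j.input_bytes, j.shuffle_bytes, j.output_bytes) for j in ordered]
    return gaps, sizes


def generate(gaps: List[float], sizes: List[Tuple[int, int, int]],
             n_jobs: int, seed: int = 0) -> List[Job]:
    """Produce a synthetic workload by sampling from the empirical distributions.
    The first job is submitted at time 0; later jobs follow sampled gaps."""
    rng = random.Random(seed)  # seeded so a generated workload is reproducible
    workload = []
    t = 0.0
    for _ in range(n_jobs):
        inp, shuf, outp = rng.choice(sizes)
        workload.append(Job(t, inp, shuf, outp))
        t += rng.choice(gaps)
    return workload


# Made-up trace values, for illustration only.
TRACE = [
    Job(0.0, 1_000_000, 200_000, 50_000),
    Job(4.0, 64_000_000, 64_000_000, 1_000_000),
    Job(5.5, 2_000_000, 0, 2_000_000),
    Job(12.0, 500_000, 100_000, 10_000),
]

gaps, sizes = summarize(TRACE)
synthetic = generate(gaps, sizes, n_jobs=10)
```

A generator of this kind makes it cheap to scale a workload up or down (by changing `n_jobs` or rescaling the gaps) while keeping the job mix close to what the trace contained.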
Similar resources
Towards Energy Efficient MapReduce
Energy considerations are important for Internet datacenter operators, and MapReduce is a common Internet datacenter application. In this work, we use the energy efficiency of MapReduce as a new perspective for increasing Internet datacenter productivity. We offer a framework to analyze software energy efficiency in general, and MapReduce energy efficiency in particular. We characterize the pe...
A Performance Study of Big Data on Small Nodes
The continuous increase in volume, variety and velocity of Big Data exposes datacenter resource scaling to an energy utilization problem. Traditionally, datacenters employ x86-64 (big) server nodes with power usage of tens to hundreds of watts. But lately, low-power (small) systems originally developed for mobile devices have seen significant improvements in performance. These improvements could...
Energy Efficiency for MapReduce Workloads: An In-depth Study
Energy efficiency has emerged as a crucial optimization goal in data centers. MapReduce has become a popular and even fashionable distributed processing model for parallel computing in data centers. Hadoop is an open-source implementation of MapReduce, which is widely used for short jobs requiring low response time. In this paper, we conduct an in-depth study of the energy efficiency for MapRedu...
Reducing Cluster Energy Consumption through Workload Management
Energy consumption is a major and costly problem in data centers. For many workloads, a large fraction of energy goes to powering idle machines that are not doing any useful work. There are two causes of this inefficiency: low server utilization and a lack of power proportionality. We focus on addressing this problem for two workloads: (1) a traditional, front-end web server workload and (2) an...